“Mathematical Contributions to the Theory of Evolution. On
Homotyposis in Homologous but Differentiated Organs.” By
Karl Pearson, F.R.S., University College, London. Received
January 20, Read February 19, 1903.

(1.) In the paper on “Homotyposis in the Vegetable Kingdom,”* I 
defined homotypes as “undifferentiated like organs.” In the course 
of that paper, I endeavoured to indicate that I was not unconscious 
of the influence of age, local environment, and position upon organism 
in modifying homotypic correlation. The object of my memoir, however,
was to obtain some general appreciation of the average intensity
of individuality in living forms, and to see if it approached the average 
value of fraternal heredity in plant or animal life. For this purpose 
I selected such material as was readily available, indicating the series 
where I thought differentiation of a sensible amount was present owing 
to the age, the situation, or the environment factors. 

From the standpoint of theory, however, we are not compelled to 
adopt a mere indication of this kind. As soon as we can correlate
between: (a) age and the quantitative character of the homologous
organs, (b) situation on the organism and this same character, or (c)
local environment and the character, we can allow for the differentiation
of homologous parts, or reduce them to pure homotypes. In other
words, homotyposis can be deduced from differentiated homologous
parts, if we correct for the differentiation due to (a), (b) or (c). The
test for the existence of such differentiation is simply the presence or 
absence of the corresponding correlation. 

We have accordingly the following problems to find solutions for:

(i.) To find the correction to be made to the apparent homotypic
correlation when the pairs of homologous parts are differentiated from
each other by their periods of growth.

(ii.) To find the correction to be made to the apparent homotypic 
correlation, when each pair of homotypes is differentiated by a common 
period of growth from other pairs of homotypes. 

(iii.) To find the correction to be made to the apparent homotypic 
correlation when the pairs of homologous parts are differentiated from 
each other by situation on the organism. 

(iv.) To find the correction to be made to the apparent homotypic 
correlation when each pair of homotypes is differentiated by the 
environment of its organism from other pairs of homotypes. 

It will be seen that in problems (ii) and (iv) we are dealing with
true homotypes, but that the homotypic factor requires modifying for
the influence of age or environment on the organism. In (i) and (iii)
we are not dealing with homotypes at all, but with homologous parts,
and we wish to reduce them to homotypes by correcting for differences
between them due to growth or to situation on the organism.

* ‘Phil. Trans.,’ A, vol. 197, pp. 285-379.

I propose at present to deal only with problems (i) to (iii), not 
because (iv) does not admit of theoretical treatment, but because we 
have not thus far obtained data to illustrate satisfactorily the correlation 
between character and the immediate environment of the individual 
organism. Experimental determinations of homotyposis in plants,
when the individuals are subjected to a graduated environmental
scale, e.g., in depth of soil or quantity of moisture allowed, would be
fairly easy to carry out, and most interesting in result. I hope it may
be possible to arrange experiments of this kind for the coming summer.
We can then illustrate the fourth proposition from actual observation,
and the publication of its theoretical solution will be of greater
value.

(2.) To find the correction to be made to the apparent homotypic
correlation when the pair of homologous parts are differentiated from
each other by their periods of growth.

Let $x$ and $y$ denote the characters in the two homologous parts
quantitatively determined, and $t_1$, $t_2$ their respective periods of growth.
Then we have four variable quantities $x$, $y$, $t_1$, $t_2$, no one of which
fixes absolutely any other, for individuals will have different characters
even with the same period of growth. The proposition accordingly
reduces to this: What is the correlation $R$ between $x$ and $y$ for
constant values of the variables, i.e., selected values of $t_1$ and $t_2$?

This problem is answered in formulae (lviii), (lix) and (lx) of my 
memoir : “On the Influence of Natural Selection on the Variability 
and Correlation of Organs.”* 

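Let us write in those formulae $t_1$ for the subscript 1, $t_2$ for 2, $x$ for 3,
and $y$ for 4; we have at once

$$\frac{\Sigma_x^2}{\sigma_x^2} = \frac{1 - r_{t_1t_2}^2 - r_{xt_1}^2 - r_{xt_2}^2 + 2 r_{t_1t_2} r_{xt_1} r_{xt_2}}{1 - r_{t_1t_2}^2} \qquad \text{(i)}$$

$$\frac{\Sigma_y^2}{\sigma_y^2} = \frac{1 - r_{t_1t_2}^2 - r_{yt_1}^2 - r_{yt_2}^2 + 2 r_{t_1t_2} r_{yt_1} r_{yt_2}}{1 - r_{t_1t_2}^2} \qquad \text{(ii)}$$

$$R\,\frac{\Sigma_x \Sigma_y}{\sigma_x \sigma_y} = \frac{r_{xy}(1 - r_{t_1t_2}^2) - r_{xt_1} r_{yt_1} - r_{xt_2} r_{yt_2} + r_{t_1t_2}(r_{xt_1} r_{yt_2} + r_{yt_1} r_{xt_2})}{1 - r_{t_1t_2}^2} \qquad \text{(iii)}$$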

Now if we deal with direct and not cross-homotyposis, i.e., with the
correlation of the same character in two homologous parts, we can put
these results more simply. We in this case render our correlation
tables symmetrical by entering each one of a pair of homologues first
as an $x$ and then as a $y$. We may then write

$$r_{xt_1} = r_{yt_2} = r', \qquad r_{xt_2} = r_{yt_1} = r'', \qquad r_{xy} = \rho, \qquad r_{t_1t_2} = r,$$

and we find

$$\sigma_x = \sigma_y, \qquad \Sigma_x = \Sigma_y, \qquad \frac{\Sigma_x^2}{\sigma_x^2} = \frac{1 - r^2 - r'^2 - r''^2 + 2rr'r''}{1 - r^2},$$

$$R = \rho\,\frac{1 - r^2}{1 - r^2 - r'^2 - r''^2 + 2rr'r''} - \frac{2r'r'' - r(r'^2 + r''^2)}{1 - r^2 - r'^2 - r''^2 + 2rr'r''} \qquad \text{(iv)}$$


This is the full solution of the first problem. 

We see that in order to solve it, it is necessary:

(i.) To find the correlation $\rho$ of the homologous pairs as if they
were simple homotypes.

(ii.) To find the correlation $r$ between the growth periods of each
pair of homotypes.

(iii.) To find the correlation $r'$ between the character and the period
of growth.

(iv.) To find the correlation $r''$ between the character of one
homotype and the period of growth of its fellow.

Now these correlations can be found at once by the usual statistical 
processes, if the data are forthcoming. 
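
The four correlations listed above are all that formula (iv) requires. A minimal Python sketch of that formula follows; the function name `corrected_correlation` and the sample inputs are assumptions made for illustration only and are not taken from the memoir.

```python
def corrected_correlation(rho, r, r1, r2):
    """Formula (iv): correlation corrected for periods of growth.

    rho: uncorrected correlation of the homologous pair
    r:   correlation between the two growth periods
    r1:  correlation of a character with its own growth period (r')
    r2:  correlation of a character with its fellow's growth period (r'')
    """
    denom = 1 - r**2 - r1**2 - r2**2 + 2 * r * r1 * r2
    if denom <= 0:
        raise ValueError("the three correlations are not mutually consistent")
    numer = rho * (1 - r**2) - (2 * r1 * r2 - r * (r1**2 + r2**2))
    return numer / denom


# Hypothetical inputs, chosen only to show the call.
example = corrected_correlation(rho=0.60, r=0.88, r1=0.45, r2=0.39)
print(round(example, 4))
```

When the character is uncorrelated with either growth period (`r1 = r2 = 0`), the function returns `rho` unchanged, as the formula requires.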

(3.) I propose to illustrate this on material, which, although not 
homotypic, is so analogous that it brings out all the important features. 

We will determine the correlation between the head-length of 
brothers, such length being measured on school boys of all ages, from 
4 to 19.* It will be clear that we have here all the difficulties of the 
homotypic problem—resemblance due to common origin obscured by 
differences in the period of growth of each individual. 

Table I gives the correlation of pairs of brothers without regard to 
their differences of age. 

Table II gives the correlation between age and length of head in 
the same individual. 

Tables IIIa and IIIb give the correlation between the age of one
brother and the length of head of the second.

Table IV gives the correlation between the ages of pairs of brothers.

These tables have been prepared by taking off from the brother-brother
data papers of my school measurement records all the available
pairs of cases falling into each series. Thus in some cases the
ages of both brothers were given, but not the head measurement of
one or other; in other cases the head measurements of both, but the
age of one or other would fail, or again the age of one and the head
measurement of the other might be all the information available.
Thus the total number of cases and the frequency distribution vary
slightly from one table to a second.

* The measurements form part of the material obtained with the assistance of
a grant from the Royal Society Government Grant Committee.

A few remarks must be made on these tables.

Table I gives the following values of the constants:

Mean length of head of elder brother     = 186.7508 mm.
Mean length of head of younger brother   = 183.8296 mm.
Standard deviation of elder brother      =   7.5027 mm.
Standard deviation of younger brother    =   7.3536 mm.

The correlation is, then, found to be 0.601,751,* and the regression,
younger on elder brother, 0.5897. These give the intensity of heredity,
uncorrected for the growth factor.

Now, the most noteworthy part of this result is, as we shall see later,
that taking brothers at different ages tends to exaggerate the apparent intensity
of heredity. If we were to take pairs of boys at ages from 4 to 19, each
pair having no hereditary relationship, but being, on the average,
within a year or so of the same age, we should find a spurious correlation
due to the mixture of material, each pair having approximately like
head-lengths because the members of it were, approximately, of like
age. On the other hand, if the boys were blood relations of very
different ages, their apparent relationship would be weakened, because
we should be correlating the same organ at different stages of its
growth. We have thus two factors: one tending to exaggerate, and
the other to weaken the apparent strength of hereditary resemblance.
It is of great interest to note that the former factor in the present
case is the more effective.

In Table II we have what I term a growth table, i.e., a correlation
table between period of growth and the quantitative measure of a
character. The constants of this table are as follows:

Mean age of boy                      =  13.0394 years.
Standard deviation of age            =   2.8207 years.
Mean head-length                     = 185.4516 mm.
Standard deviation of head-length    =   7.4991 mm.
Correlation of age and head-length   =   0.453,496

The regression coefficient for head-length on age = 1.205676, and
we have the probable head-length $H_p$ for observed age $A$ given by

$$H_p = 169.7303 + 1.2057\,A \qquad \text{(e)}$$

Thus, on the average, boys’ heads grow in length 1.2 mm. a year.
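
The line (e) follows from the five constants of Table II by the usual regression rule, slope = r × (S.D. of head-length) / (S.D. of age). The short sketch below reworks that arithmetic; the helper name `regression_line` is an assumption introduced here, and the small differences in the last decimal places come from rounding in the printed constants.

```python
def regression_line(mean_x, sd_x, mean_y, sd_y, r):
    """Return (intercept, slope) of the regression of y on x."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Constants of Table II: age in years, head-length in mm.
intercept, slope = regression_line(
    mean_x=13.0394, sd_x=2.8207, mean_y=185.4516, sd_y=7.4991, r=0.453496
)
print(f"H_p = {intercept:.3f} + {slope:.3f} A")
```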

My results are based on 1637 cases entirely taken off the brother-brother
data papers. Dr. Alice Lee at an earlier stage also worked out
a growth table. We had not then so many brother-brother data
papers filled in. She used in addition all the brother measurements
on the brother-sister papers, and so reached 1856 boys, of which, I
think, we may safely assert that 400 at least are not included in my
series. She found: Mean age,† 12.7177; mean head-length, 184.8182;
and slope of regression line, 1.2040, giving the formula

$$H_p = 169.5061 + 1.2040\,A,$$

a result in substantial agreement with mine.

* Six figures have been kept in the correlation coefficients, as we require to
calculate the regression coefficients from the differences of products and powers.

† The mean age is less, because brother-sisters are obtained chiefly from
primary, not secondary, schools.

Diagram 1. Observed mean head-length at each year of life, with the line (e).
Horizontal axis: Age of Boy.

In Diagram 1 the formula (e) is represented with the observed
mean values at each year of life. The results for the 4th, 5th, and
19th years of life ought not to be considered, for they are based on
only 2, 10, and 12 observations respectively. It will clearly hardly
be possible to express the growth curve better than by a straight
line, until the range of data is very largely extended. The regression
is sensibly linear.

Table IIIa and Table IIIb give the following results:

Age of elder brother and head-length of younger brother:

Mean age of elder brother                                = 14.1249
S.D. of elder brother’s age                              = 2.5124
Mean head-length, younger brother                        = 183.8578
S.D. of head-length, younger brother                     = 7.2806
Correlation of age of elder and head-length of younger   = 0.396,598

Age of younger brother and head-length of elder brother:

Mean age of younger brother                              = 11.7149
S.D. of younger brother’s age                            = 2.7221
Mean head-length, elder brother                          = 186.6515
S.D. of head-length, elder brother                       = 7.5005
Correlation of age of younger and head-length of elder   = 0.379,326

We see accordingly that within the limits of the probable error, the
correlation between younger brother’s head-length and elder brother’s
age is the same as that between elder brother’s head-length and
younger brother’s age. This result might, to some extent, have been
anticipated, but actual proof of this type of cross-relation is of value.
In Table IV we have the correlation between ages of brothers giving
the constants:

Mean age of elder brother        = 14.1508
Mean age of younger brother      = 11.7487
S.D. of elder brother’s age      = 2.5080
S.D. of younger brother’s age    = 2.7220
Correlation of brothers’ ages    = 0.884,186


The first four results are in good agreement with those of Tables
IIIa and IIIb. The last result shows how nearly there is an approximation
to a constant difference in age between brothers in schools.
Very closely we have

Probable age of younger brother = 0.96 × (age of elder brother) - 1.83.

When the elder brother is 6, his younger brother is probably 2.1 years
younger than he is; when the elder brother is 12, the younger brother
is probably 2.3 years younger, and when he is 18, 2.6 years younger.
The explanation of this is that when the elder brother is very young
only his near or second brother will, as a rule, be at the same school,
but in the secondary schools, which he reaches at a much later age, it
is possible for a much younger brother to be at the same school.
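
The regression of younger brother’s age on elder brother’s age can be reworked from the Table IV constants. The following sketch is only an arithmetical check; the names `SLOPE`, `INTERCEPT` and `age_gap` are introduced here for illustration.

```python
# Constants of Table IV (ages in years).
MEAN_ELDER, MEAN_YOUNGER = 14.1508, 11.7487
SD_ELDER, SD_YOUNGER = 2.5080, 2.7220
R_AGES = 0.884186

SLOPE = R_AGES * SD_YOUNGER / SD_ELDER
INTERCEPT = MEAN_YOUNGER - SLOPE * MEAN_ELDER


def age_gap(elder_age):
    """Probable number of years by which the younger brother is younger."""
    probable_younger = INTERCEPT + SLOPE * elder_age
    return elder_age - probable_younger


print(f"younger = {SLOPE:.2f} x elder {INTERCEPT:+.2f}")
for elder in (6, 12, 18):
    print(elder, round(age_gap(elder), 1))
```

The slope and intercept round to 0.96 and -1.83, and the gaps at elder ages 6, 12 and 18 round to the 2.1, 2.3 and 2.6 years quoted in the text.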

Now let us substitute the correlation values found in equations (i)
to (iii) of page 290. We have

$$r_{xy} = 0.601{,}654, \qquad r_{t_1t_2} = 0.884{,}186,$$

$$r_{xt_1} = r_{yt_2} = 0.453{,}496,$$

$$r_{xt_2} = 0.379{,}326, \qquad r_{yt_1} = 0.396{,}598.$$

Whence we find

$$\Sigma_x/\sigma_x = 0.890{,}051, \qquad \Sigma_y/\sigma_y = 0.891{,}209,$$

and

$$R = 0.5446.$$

This is a very reasonable value of fraternal correlation, agreeing
quite well with results obtained for horse, man and dog. It is worth
noting that

$$r_{xt_1} \times r_{t_1t_2} = r_{yt_2} \times r_{t_1t_2} = 0.4010,$$

and, therefore, either equals $r_{xt_2}$ or $r_{yt_1}$ fairly closely; in fact, within
the probable error of their difference.

Hence, it would appear highly probable that the cross-relation 
between one brother’s head length and a second brother’s age is solely 
due to the correlation of the ages between the two brothers. 
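
The substitution into formulae (i) to (iii) can be repeated numerically. The sketch below is a plain transcription of those three formulae, with the elder brother taken as $x$ (age $t_1$) and the younger as $y$ (age $t_2$); the function name `growth_corrected` is an assumption made for this illustration.

```python
from math import sqrt


def growth_corrected(r_xy, r_t, r_xt1, r_xt2, r_yt1, r_yt2):
    """Return (Sigma_x/sigma_x, Sigma_y/sigma_y, R) from formulae (i)-(iii)."""
    base = 1 - r_t**2
    sx = sqrt((base - r_xt1**2 - r_xt2**2 + 2 * r_t * r_xt1 * r_xt2) / base)
    sy = sqrt((base - r_yt1**2 - r_yt2**2 + 2 * r_t * r_yt1 * r_yt2) / base)
    product = (r_xy * base - r_xt1 * r_yt1 - r_xt2 * r_yt2
               + r_t * (r_xt1 * r_yt2 + r_yt1 * r_xt2)) / base
    return sx, sy, product / (sx * sy)


sx, sy, R = growth_corrected(
    r_xy=0.601654, r_t=0.884186,
    r_xt1=0.453496, r_xt2=0.379326,
    r_yt1=0.396598, r_yt2=0.453496,
)
print(f"{sx:.4f} {sy:.4f} {R:.4f}")
print(f"{0.453496 * 0.884186:.4f}")  # product compared with the cross-correlations
```

The three printed values agree with 0.890,051, 0.891,209 and 0.5446 to the figures given, and the product agrees with 0.4010.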

If such a result as

$$r_{xt_1} \times r_{t_1t_2} = r_{xt_2} \qquad \text{(v)}$$

should be verified on the reduction of further data, it will enable us
to much simplify our formulae.

Thus we easily find for this case

$$\Sigma_x = \sigma_x\sqrt{1 - r_{xt_1}^2}, \qquad \Sigma_y = \sigma_y\sqrt{1 - r_{yt_2}^2},$$

and

$$R = \frac{r_{xy} - r_{xt_1} r_{yt_2} r_{t_1t_2}}{\sqrt{(1 - r_{xt_1}^2)(1 - r_{yt_2}^2)}}.$$

Or, we require to find only the uncorrected correlation ($\rho$), the growth
correlation ($r'$), and the correlation between periods of growth ($r$).
The correction to be made to the apparent correlation is then the
subtraction from it of

$$\frac{r'^2(r - \rho)}{1 - r'^2}.$$

I hope shortly to ascertain whether relations like (v) above hold also
for other head-measurements on growing children.
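
That relation (v) reduces the symmetrical formula (iv) to the simple form can be confirmed by direct substitution. The sketch below does so for one set of hypothetical correlations; the function names and the numbers are assumptions chosen for illustration, not values from the tables.

```python
def full_formula(rho, r, r1, r2):
    """Formula (iv) for the symmetrical case."""
    d = 1 - r**2 - r1**2 - r2**2 + 2 * r * r1 * r2
    return (rho * (1 - r**2) - (2 * r1 * r2 - r * (r1**2 + r2**2))) / d


def simplified_formula(rho, r, r1):
    """Form taken by (iv) when r'' = r * r', i.e. when relation (v) holds."""
    return (rho - r * r1**2) / (1 - r1**2)


rho, r, r1 = 0.60, 0.88, 0.45       # hypothetical values
r2 = r * r1                          # relation (v)
correction = r1**2 * (r - rho) / (1 - r1**2)

print(full_formula(rho, r, r1, r2))
print(simplified_formula(rho, r, r1))
print(rho - correction)
```

All three printed numbers coincide up to floating-point rounding, which is the content of the simplification.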




Table I. Collateral Heredity: Head-Length in Brothers, uncorrected for Age.

Head-length of Elder Brother in mm.*

* The group 162-3 contains all boys with head-lengths greater than 161.5 and
less than 163.5, and so on. When a head-length fell exactly on such a value
as 183.5 it was halved between the columns 182-3 and 183-4.





















Table II. Correlation of Age and Length of Head in Boys.

Year of Age of Boy.*

* All boys from the nth to the (n + 1)th anniversary of their birthday would
be placed in the nth column, or the mean age of such boys = n + 0.5 years.

† See Note to Table I.

This table contains only fifteen pairs of twins, one pair of brothers having
been born within ten months of each other. Any age as 9 means all falling
within the ninth year, i.e., from the ninth birthday to the day before the
tenth birthday, so that two brothers, not twins, might appear of the same age.
(4.) To find the correction to be made to the apparent homotypic correlation
when each pair of homotypes is differentiated by a common period of growth
from other pairs of homotypes.

The solution of this problem may be deduced at once from equation
(iii) of the preceding problem by simply putting $t_1 = t_2$. In this case
$r' = r''$, $r = 1$, and we find

$$R = \frac{\rho - r'^2}{1 - r'^2}.$$

This equation was given by me in a note in Biometrika, vol. 1,
p. 404, and its use illustrated on Dr. Simpson’s data for Paramecium
caudatum.

(5.) To find the correction to be made to the apparent homotypic correlation
when the pair of homologous parts are differentiated from each other by
situation on the organism.

We have only to put in formula (iii) on p. 290, $t_1 = p_1$ and $t_2 = p_2$,
the positional co-ordinates of the first and second homologous parts, to
make that formula available for position instead of age differentiation.
If we denote by $c_1$ and $c_2$ the characters of the parts in the positions $p_1$
and $p_2$ respectively, our solution takes the form below, where we have
confined our attention to the same character,

$$R = \frac{\rho(1 - r_{p_1p_2}^2)}{1 - r_{p_1p_2}^2 - r_{p_1c_1}^2 - r_{p_1c_2}^2 + 2 r_{p_1p_2} r_{p_1c_1} r_{p_1c_2}} - \frac{2 r_{p_1c_1} r_{p_1c_2} - r_{p_1p_2}(r_{p_1c_1}^2 + r_{p_1c_2}^2)}{1 - r_{p_1p_2}^2 - r_{p_1c_1}^2 - r_{p_1c_2}^2 + 2 r_{p_1p_2} r_{p_1c_1} r_{p_1c_2}} \qquad \text{(vi)}$$

This follows since $r_{p_1c_1} = r_{p_2c_2}$ and $r_{p_1c_2} = r_{p_2c_1}$.

We have again, therefore, to find four correlation coefficients. But
this formula simplifies immensely if we observe the following
conditions:

(a.) Take the same number of homotypes or homologous parts from
the same positions in each organism.

(b.) Enter each one of these homotypes or homologous parts with
each other on the same organism, so as to obtain a symmetrical table,
i.e., $c_1$ is first entered with $c_2$ and then $c_2$ with $c_1$.

These conditions are or can be usually satisfied in any homotyposis 
investigation. 

(6.) Further, the positions will, as a rule, be arranged in series and 
may be numbered 1, 2, 3, 4, ... m, if m homotypes or homologous parts be
taken from each individual organism. The position scale is, of course, 
perfectly arbitrary, and has nothing to do, for example, with the 
actual distances between positions on the organism. We can make it a 
uniform numerical scale, which for convenience we can take to be the 
same serial order as that of positions on the organism. 
Let $\bar{p}$ = mean position, $\sigma_p$ = standard deviation of positions on the
arbitrary position scale. Let there be $n$ organisms, and suppose that
$S_m$ denotes a summation of all $m$ homologous parts on an organism,
and $S_n$ a summation for all $n$ organisms. Then, if $\sigma_p = \sigma_{p_1} = \sigma_{p_2}$,

$$nm(m-1)\,r_{p_1p_2}\sigma_p^2 = S_n\{S_m(p_1 - \bar{p})(p_2 - \bar{p})\}$$

$$= S_n\{S_m(p_1 - \bar{p}) \times S_m(p_2 - \bar{p}) - S_m(p_1 - \bar{p})^2\}.$$

But $S_m(p_1 - \bar{p}) = 0$, hence, since $S_m(p_1 - \bar{p})^2 = m\sigma_p^2$,

$$nm(m-1)\,r_{p_1p_2}\sigma_p^2 = -nm\sigma_p^2,$$

or

$$r_{p_1p_2} = -\frac{1}{m-1} \qquad \text{(vii)}$$

Further

$$nm(m-1)\,r_{p_1c_2}\sigma_p\sigma_c = S_n\{S_m(p_1 - \bar{p})(c_2 - \bar{c})\}$$

$$= S_n\{S_m(c_2 - \bar{c}) \times S_m(p_1 - \bar{p}) - S_m(c_1 - \bar{c})(p_1 - \bar{p})\}$$

$$= -S_n\{S_m(c_1 - \bar{c})(p_1 - \bar{p})\}.$$

But $S_n\{S_m(c_1 - \bar{c})(p_1 - \bar{p})\} = nm\,\sigma_c\sigma_p\,r_{c_1p_1}$.

Hence

$$r_{p_1c_2} = -\frac{r_{p_1c_1}}{m-1} = r_{p_1c_1} \times r_{p_1p_2} \qquad \text{(viii)}$$

a relation precisely similar to that discovered in the case of growth
periods for brother’s head-lengths from the actual numbers on p. 294.
Substituting we find

$$1 - r_{p_1p_2}^2 - r_{p_1c_1}^2 - r_{p_1c_2}^2 + 2 r_{p_1p_2} r_{p_1c_1} r_{p_1c_2} = (1 - r_{p_1p_2}^2)(1 - r_{p_1c_1}^2),$$

$$2 r_{p_1c_1} r_{p_1c_2} - r_{p_1p_2}(r_{p_1c_1}^2 + r_{p_1c_2}^2) = r_{p_1p_2}\,r_{p_1c_1}^2(1 - r_{p_1p_2}^2).$$

Then substituting in (vi) and using (vii) we determine the simple
formula for homotyposis corrected for positional differentiation

$$R = \frac{\rho}{1 - r_{pc}^2} + \frac{1}{m-1}\,\frac{r_{pc}^2}{1 - r_{pc}^2} \qquad \text{(ix)}$$


where $r_{pc}$ stands for the correlation of character and position on the
organism. 

An exactly similar formula might be found for the correction for the 
age or growth factor, if the m homologous parts dealt with had the 
same distribution of ages or growths in each organism. 
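
Relations (vii) and (viii) are algebraic identities of the symmetrical table, so they can be checked on any data entered in that way. The sketch below builds such a table from simulated organisms; the data, the seed and the helper `pearson` are assumptions made purely for illustration.

```python
import random
from itertools import permutations


def pearson(pairs):
    """Product-moment correlation of a list of (a, b) pairs."""
    n = len(pairs)
    ma = sum(a for a, _ in pairs) / n
    mb = sum(b for _, b in pairs) / n
    sab = sum((a - ma) * (b - mb) for a, b in pairs)
    saa = sum((a - ma) ** 2 for a, _ in pairs)
    sbb = sum((b - mb) ** 2 for _, b in pairs)
    return sab / (saa * sbb) ** 0.5


random.seed(0)
m, n = 5, 40
# organisms[k][j] is the character at position j + 1 (made-up positional trend).
organisms = [[10 + 0.8 * pos + random.gauss(0, 1) for pos in range(1, m + 1)]
             for _ in range(n)]

pos_pos, pos_own, pos_fellow = [], [], []
for org in organisms:
    # Symmetrical table: every ordered pair of distinct positions is entered.
    for i, j in permutations(range(m), 2):
        pos_pos.append((i + 1, j + 1))        # p1 with p2
        pos_own.append((i + 1, org[i]))       # p1 with c1
        pos_fellow.append((i + 1, org[j]))    # p1 with c2

r_p1p2 = pearson(pos_pos)
r_p1c1 = pearson(pos_own)
r_p1c2 = pearson(pos_fellow)
print(r_p1p2, -1 / (m - 1))
print(r_p1c2, r_p1c1 * r_p1p2)
```

Each printed pair agrees up to floating-point rounding, whatever positional trend or scatter is put into the simulated characters.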

(7.) Now the equation just found has the serious disadvantage that
it is based on the linearity* of the regression relation between position
and character. But while organic and homotypic correlations give for
a surprising variety of cases sensibly linear regression relations, the
relation between position and mean character is far more rarely linear.
We obtain, as a rule, remarkably smooth curves. We, therefore,
require some modification of equation (ix).

* The reader should note that this condition does not involve any assumption
of normal frequency, or the Gaussian law. The latter applies only to a very
special case of linear regression.

Still supposing the regression of character and position linear, we
should have, if $\sigma'$ be the mean standard deviation of an array of organs
in the same position,

$$\sigma'^2 = \sigma^2(1 - r_{pc}^2).$$

But if $\sigma_M$ be the standard deviation of the means of the arrays, we
have from first principles

$$\sigma^2 = \sigma_M^2 + \sigma'^2.$$

Hence $\sigma_M = \sigma \times r_{pc}$.

We can now write equation (ix) in the form

$$R = \rho\,\frac{\sigma^2}{\sigma^2 - \sigma_M^2} + \frac{\sigma_M^2}{(m-1)(\sigma^2 - \sigma_M^2)} \qquad \text{(x)}$$



This is quite free from r pc , and, what is more, although we have deduced 
it from (ix) and the relation $\sigma'^2 = \sigma^2(1 - r_{pc}^2)$ peculiar to linear
regression, it is now free of any limitation as to the nature of the 
relation between position and mean character. Thus (x) is a far more 
important formula than (ix), and should always be used, until we have 
shown that the relation between position and mean character is 
sensibly linear. If anything, it involves less arithmetic than (ix). 
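
The agreement of (x) with (ix) in the linear case, where $\sigma_M = \sigma \times r_{pc}$, can be seen with a few lines of arithmetic. The sketch below uses hypothetical values throughout, and the function names are assumptions introduced for the illustration.

```python
def corrected_ix(rho, r_pc, m):
    """Formula (ix): needs the correlation of character and position."""
    return rho / (1 - r_pc**2) + r_pc**2 / ((m - 1) * (1 - r_pc**2))


def corrected_x(rho, var_total, var_means, m):
    """Formula (x): needs only the total variance and the variance of array means."""
    within = var_total - var_means
    return rho * var_total / within + var_means / ((m - 1) * within)


rho, r_pc, m, sigma = 0.30, 0.50, 9, 2.0   # hypothetical values
var_means = (sigma * r_pc) ** 2            # linear regression: sigma_M = sigma * r_pc

print(corrected_ix(rho, r_pc, m))
print(corrected_x(rho, sigma**2, var_means, m))
```

The two results coincide here because the array means were placed on a straight line by construction; with non-linear array means only `corrected_x` remains applicable.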

We can show this ab initio as follows: Let the individuality of the
organism in any homologous part be measured by its excess above
(respectively defect below) the mean value of the character for the
homologous part in that position.* Then, if $c'$ = element of character
due to individuality, and $\bar{c}_p$ be the mean character in any position for
the $n$ individuals dealt with,

$$c'_1 = c_1 - \bar{c}_{p_1}, \qquad S_n(c_1) = n\bar{c}_{p_1}, \qquad \text{and} \qquad S_n(c'_1) = 0.$$

Hence we easily find

$$S_n(c_1'^2) = S_n(c_1^2) - n\bar{c}_{p_1}^2,$$

$$S_m S_n(c_1'^2) = S_m S_n(c_1^2) - n\,S_m(\bar{c}_{p_1}^2).$$

Hence, if $\bar{c} = S_m S_n(c_1)/(mn) = S_m(\bar{c}_p)/m$, we have

$$\sigma'^2 = \sigma^2 - \sigma_M^2,$$

where $\sigma'$ is the standard deviation of the character-individualities free
from the position factor. We see that it is precisely the same quantity
as we have previously used for the mean standard deviation of the
arrays for given positions.

* It might be considered better, if the standard deviations of the homologous
parts vary very considerably with position, to measure the individuality by the
ratio of this excess to the corresponding standard deviation. Not only, however,
does the use of such a ratio immensely increase the arithmetical labour, which is
a possibility, which of course, we could face, but there is also a question as to
whether the ratio is really a truer measure of individuality. A full discussion of
this important point must for the present be deferred.

Next taking the correlation of characters $c'_1$ and $c'_2$ in positions $p_1$
and $p_2$ we have

$$S_m(c_1'^2) + S_m(c'_1 c'_2) = S_m(c_1^2) + S_m(c_1 c_2) - 2S_m(c_1)\,m\bar{c} + m^2\bar{c}^2.$$

To get this result we have multiplied every quantity like $c'_1 = c_1 - \bar{c}_{p_1}$
by every other quantity like $c'_2 = c_2 - \bar{c}_{p_2}$ and by itself, and then added
such quantities together for every position on the one organism. Thus
on the left hand side there are $m$ terms in the first, $m(m-1)$ terms in
the second summation; on the right hand side there are $m$ terms in
the first, $m(m-1)$ terms in the second and $m$ terms in the third
summation. Now sum for each of the $n$ organisms, and we have

$$nm\sigma'^2 + nm(m-1)R\sigma'^2 = mn(\sigma^2 + \bar{c}^2) + nm(m-1)(\rho\sigma^2 + \bar{c}^2) - 2m^2n\bar{c}^2 + m^2n\bar{c}^2.$$

Whence

$$R = \rho\,\frac{\sigma^2}{\sigma'^2} + \frac{\sigma_M^2}{(m-1)\sigma'^2},$$

or, as before

$$R = \rho\,\frac{\sigma^2}{\sigma^2 - \sigma_M^2} + \frac{\sigma_M^2}{(m-1)(\sigma^2 - \sigma_M^2)} \qquad \text{(x)}$$
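
Because the proof just given is purely algebraic, formula (x) should reproduce exactly the correlation obtained by first removing each position’s mean and then forming the symmetrical table of the remainders. The sketch below tests this on simulated organisms; the positional means, the seed and the helper names are assumptions made for illustration and are not the Equisetum data.

```python
import random
from itertools import permutations


def mean(values):
    return sum(values) / len(values)


def symmetric_correlation(organisms):
    """Correlation from a symmetrical table: every ordered pair of parts on the
    same organism is entered, so both margins share one mean and one variance."""
    pairs = [(org[i], org[j])
             for org in organisms
             for i, j in permutations(range(len(org)), 2)]
    centre = mean([a for a, _ in pairs])
    variance = mean([(a - centre) ** 2 for a, _ in pairs])
    covariance = mean([(a - centre) * (b - centre) for a, b in pairs])
    return covariance / variance


random.seed(1)
m, n = 6, 50
position_means = [7.7, 9.3, 9.8, 9.6, 8.0, 5.0]   # made-up, non-linear in position
organisms = []
for _ in range(n):
    plant = random.gauss(0, 1)                     # individuality shared by all parts
    organisms.append([mu + plant + random.gauss(0, 1) for mu in position_means])

rho = symmetric_correlation(organisms)             # apparent correlation
array_means = [mean([org[j] for org in organisms]) for j in range(m)]
grand_mean = mean(array_means)
var_total = mean([(c - grand_mean) ** 2 for org in organisms for c in org])
var_means = mean([(a - grand_mean) ** 2 for a in array_means])

# Formula (x), written over the common denominator sigma^2 - sigma_M^2.
r_formula = (rho * var_total + var_means / (m - 1)) / (var_total - var_means)

# Direct route: correlate the deviations from each position's own mean.
deviations = [[org[j] - array_means[j] for j in range(m)] for org in organisms]
r_direct = symmetric_correlation(deviations)
print(rho, r_formula, r_direct)
```

The last two printed values coincide up to rounding, and both exceed the apparent correlation, since the positional differences lower the correlation found from the unadjusted table.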

Now while this proof is independent of the theory of partial correlation
coefficients, involving only simple algebra, and is further
independent of any consideration of linear regression, it yet wants
something of the width of the former theory, which allows us at once,
for example, to correct for a combination of factors, such for example
as for both growth and position influences simultaneously. The
difficulty lies entirely in the extent within which it is legitimate to
assume the relation between position or age, and the mean value of the
character at that position or age to be linear. It is therefore clearly
advisable to start by plotting this relationship,* and fitting, if possible,
such position or growth graphs with appropriate curves. If, for the
series of positions dealt with or the period of growth taken, we find that
a straight line† is a close approximation to the relationship, then we
may use the general theory of partial correlation, otherwise we must
fall back on results like (x). For example, in head growth in boys,
we cannot much improve on a straight line; in positional influence on
the branches in the whorls of Equisetum arvense we need at least a third
order parabola.

(8.) Although material for several investigations on the homotyposis
of serial homologous parts has been collected, the progress in
some of these cases is slow, as it involves rather laborious microscopic
measurement. I content myself at present with an illustration
from the vegetable kingdom.

I collected in the autumn of last year, 126 plants of Equisetum arvense
in Raydale Side, an offshoot of Wensley Dale; the plant was growing
on a lane side high up above Semmerwater. This Equisetum grows from
the top with a single stem, and I counted the number of branches to
the whorl from the root upwards. As a rule, there will be one or
two whorls close to the soil which have never developed any branches
at all; then we have what I shall term the first whorl in which some
branches have developed, but the number is irregular and obviously
subject to some cause of variation, other than the growth law of the
plant. The number of branches to the whorl then increases uniformly
and steadily up to the 4th whorl, after which it falls almost equally
steadily to the 10th whorl. Beyond this the results become somewhat
irregular again, for very few plants will be found, at any rate in the
locality considered, with more than 12 or 13 whorls, and even in these
whorls there is a certain amount of forking or irregularity which it
is difficult to deal with. The plants were certainly fully developed
when gathered at least as far as the 12th or 13th whorl, and I doubt
whether even beyond this so late in the season, any further branching
would have taken place. A few branches were broken off, and these
were of course counted; there was no difficulty, however, in easily
ascertaining whether a branch had in any case been developed or not,
and the peculiarity of the 1st whorl was certainly not due to missing,
but to undeveloped branches.

* In the case of some animals and many plants the relationship is in itself
of much interest, for it expresses a law of development or growth in serial parts.

† The analytical consideration of this point is very simple. If the regression
be linear, the means of the arrays all lie on the regression line, and the mean
standard deviation of the arrays about their means is $\sigma\sqrt{1 - r^2}$. If the regression
be not linear, the means of the arrays will have a mean square deviation $\Sigma_M^2$ from
the regression line. The mean square deviation of the arrays from the regression
line, but not from their means, is still $\sigma^2(1 - r^2)$. The mean standard deviation
(deviation of mean square from means) is now given by

$$\sigma'^2 = \sigma^2(1 - r^2) - (\sigma_M^2 - r^2\sigma^2),$$

since $\sigma^2 = \sigma'^2 + \sigma_M^2$. But we easily find

$$\Sigma_M^2 = \sigma_M^2 - r^2\sigma^2.$$

Hence $\Sigma_M$ is a good measurement of the deviation of regression from linearity, or
of $\sigma_M$ from $r\sigma$. If we take $\eta = \sigma_M/\sigma$, we have

$$\sigma'^2 = \sigma^2(1 - \eta^2), \qquad \Sigma_M^2 = (\eta^2 - r^2)\sigma^2.$$

Clearly $\eta^2$ must lie between $r^2$ and 1. Further, $\eta$ can only vanish when the
correlation is zero, or become ±1 when the correlation is perfect. Between these
values it gives the mean reduction in variability of an array as compared with the
whole population. Further, the deviation of $\eta$ from $r$ is a good measure of the
deviation of the system from linearity. Thus $\eta$ is a useful constant which ought always
to be given for non-linear systems. It measures the approach of the system not
only to linearity but to a single valued relationship, i.e., to a causal nexus.
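
The constant $\eta = \sigma_M/\sigma$ of the footnote on linearity can be computed alongside $r$ from any set of arrays. The sketch below does this for two tiny made-up data sets, one with array means on a straight line and one without; the function name and the data are assumptions chosen only to show the behaviour described.

```python
from math import sqrt


def eta_and_r(arrays):
    """arrays maps each position x to the list of character values observed there.
    Returns (eta, r): sigma_M / sigma and the product-moment correlation."""
    xs = [x for x, vals in arrays.items() for _ in vals]
    ys = [y for vals in arrays.values() for y in vals]
    n = len(ys)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Variance of the array means, each weighted by the size of its array.
    var_means = sum(len(v) * (sum(v) / len(v) - mean_y) ** 2
                    for v in arrays.values()) / n
    return sqrt(var_means / var_y), cov / sqrt(var_x * var_y)


linear = {1: [1, 3], 2: [3, 5], 3: [5, 7]}     # array means 2, 4, 6
curved = {1: [1, 3], 2: [5, 7], 3: [1, 3]}     # array means 2, 6, 2

eta_lin, r_lin = eta_and_r(linear)
eta_cur, r_cur = eta_and_r(curved)
print(eta_lin, r_lin)
print(eta_cur, r_cur)
```

For the first set $\eta$ and $r$ coincide; for the second $r$ vanishes while $\eta$ stays large, which is the sense in which the gap between them measures departure from linearity.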

Table V gives the relation between branches to the whorl and
position for the whole of the 126 plants. In two columns to the right 
are given the means and variabilities of the branches for each whorl. 

Now, whether we judge by mean or standard deviation, we see 
a perfectly gradual change from whorl to whorl, which absolutely 
precludes us from considering the number of branches to the whorl 
as a pure homotypic character. We see a marked differentiation due 
to position of the whorl on the plant; the whorls are homologous but 
not homotypic parts. 

Suppose, however, that we disregard our test for differentiation,*
and proceed to find a correlation table for the whole material as
homotypic. We have Table VI, for which I have to heartily thank
Dr. Alice Lee.

The value found for the homotypic correlation from this table is

$$\rho = -0.0064 \pm 0.0185,$$

or, there is no sensible homotyposis at all.

But we might have gone to the other extreme and taken only the
3rd, 4th, and 5th whorls, which have more nearly the same means and
standard deviations as homotypes. The result is Table VII, giving

$$\rho = 0.7918 \pm 0.0129.$$

It will be perfectly clear, therefore, as these two results ought to be
the same, if the whorls were true homotypes, that we may get any
result at all if we neglect differentiation.† The answer to this is that
no trained biometrician would call these whorls “undifferentiated
like organs” with the two right hand columns of Table V before him.

* On the test for differentiation, see ‘Biometrika,’ vol. 1, p. 334.

† Bateson, ‘Roy. Soc. Proc.,’ vol. 69, p. 200.
Table VII. Whorls, 3rd, 4th, and 5th only.

Number of Branches to 1st Whorl of Pair.

            7.    8.    9.   10.   11.   12.   13.   Totals.
Totals      46    58   206   262   162    20     2     756


Table VIII. Relation between Number of Branches and Position of Whorl.

Diagram 2. Regression curve: whorl branches and position.
Equation: $y = 9.451{,}443 - 0.549{,}4302\,x - 0.179{,}9881\,x^2 - 0.008{,}4206\,x^3$.

Now let us consider how to handle the material, allowing for the
differentiation of the whorls. To begin with, our formula requires
the use of the same number of homologous parts for each organism,
and it is, on account of the value of the probable error of the random
sample, undesirable to use fewer than 100 individuals. This leads to
our cutting off Table V at the 10th whorl. In this way we get rid
also of the forking, which certainly begins in many individuals at the
11th or 12th whorl. Table VIII gives us the data of Table V reconstituted
for 110 plants, with ten whorls apiece. The only serious
difficulty now remaining is that which I have referred to as arising
from heterogeneity in the first whorl. A glance at the mean and
standard deviation of the branches in the first whorl given in Table V
will suffice to demonstrate this heterogeneity. Certain individuals have
the normal number of about 8.5 branches to this whorl, but about a fifth
of the total number of individuals only develop about half this normal
number of branches. To illustrate this I have in Diagram 2 plotted
the mean number of branches to the whorl, and fitted these means
with a parabola of the third order,* using only whorls 2 to 10. The
equation to this parabola is

$$y = 9.451{,}443 - 0.549{,}4302\,x - 0.179{,}9881\,x^2 - 0.008{,}4206\,x^3,$$

the origin being at the 6th whorl, and $y$ giving the mean number of
branches for $x$ whorls from the 6th. We have the following results:

Position of whorl.    Observed number of branches.    Calculated. 
        1                       7.718                    8.752 
        2                       9.327                    9.308 
        3                       9.718                    9.707 
        4                       9.827                    9.898 
        5                       9.746                    9.829 
        6                       9.564                    9.451 
        7                       8.891                    8.714 
        8                       7.491                    7.565 
        9                       5.736                    5.956 
       10                       3.964                    3.835 


A much worse fit was obtained by striking a cubical parabola through 
all ten points. 
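
The comparison of observed and calculated means can be retraced with a short sketch. It assumes only the cubic coefficients and the observed means as printed in the text; the names `cubic`, `OBSERVED` and `residuals` are illustrative and not the author's.

```python
# Coefficients of the cubic as printed in the text (origin at the 6th whorl).
A0, A1, A2, A3 = 9.451443, -0.5494302, -0.1799881, -0.0084206

# Observed mean number of branches for each whorl position, from the text.
OBSERVED = {1: 7.718, 2: 9.327, 3: 9.718, 4: 9.827, 5: 9.746,
            6: 9.564, 7: 8.891, 8: 7.491, 9: 5.736, 10: 3.964}


def cubic(position):
    """Mean branches given by the cubic for a whorl position (x = position - 6)."""
    x = position - 6
    return A0 + A1 * x + A2 * x ** 2 + A3 * x ** 3


# Observed minus calculated, for every whorl including the omitted 1st.
residuals = {p: obs - cubic(p) for p, obs in OBSERVED.items()}

for p in sorted(OBSERVED):
    print(f"whorl {p:2d}: observed {OBSERVED[p]:.3f}  "
          f"calculated {cubic(p):.3f}  residual {residuals[p]:+.3f}")
```

On these figures the 1st whorl falls short of the curve by about one branch, while no whorl from the 2nd to the 10th departs from it by as much as a quarter of a branch.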

It will be seen that the excellency of fit fully justifies the use of 
this curve, but that there is a large deviation from the observed 
mean of the 1st whorl when we calculate its value from the curve 
thus obtained. Somewhat reluctantly, therefore, I felt compelled to 
omit the consideration of the 1st whorl from my investigations. Had 
I possessed a sufficient number of specimens I should have separated 
my material into two classes, those plants with normal 1st whorl and 
those with abnormal 1st whorl. But with my available material I 
should have had considerably less than 100 individuals to deal with, and 
accordingly I settled to take nine homologous parts only, namely, the 
2nd to the 10th whorls, in which the differentiation appears to be solely 
due to position on the plant. Above the 10th whorl, the phenomenon 
of forking obscures the determination of branches to the whorl, while 
below the 2nd whorl the full or partial development of branches to 
the whorl seems to be determined by the local lower vegetation 
round the stem. 

* By the method indicated in ‘Biometrika,’ vol. 2, p. 11. 

Taking Table VIII, I found for the mean of the means 8.2515 
branches, and for the standard deviation of the means $\sigma_M$, 
$\sigma_M^2 = 3.938354$. Further, if $\sigma$ be the standard deviation of the frequency 
distribution of branches, as found from the bottom row of Table VIII, 
we have 

$$\sigma^2 = 6.721083.$$

Hence for use in formula (x) we have, since m = 9, 

$$\frac{\sigma^2}{\sigma^2 - \sigma_M^2} = 2.415280, \qquad \frac{1}{m-1}\,\frac{\sigma_M^2}{\sigma^2 - \sigma_M^2} = 0.176911 \qquad \text{(xi)}.$$


Table IX gives the uncorrected homotyposis for the nine whorls 
treated as simple homotypes. From this I find, 

$$\rho = 0.131258 \qquad \text{(xii)}.$$

Substituting (xi) and (xii) in (x), we find for the homotyposis of the 
number of branches in the whorls in Equisetum arvense, when corrected 
for differentiation due to position, 

$$R = 0.4939.$$
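
The arithmetic behind this value can be sketched as follows. Formula (x) is not reproduced in this part of the paper, so the sketch assumes it combines the two quantities of (xi) as R = ρ · σ²/(σ² − σ_M²) + (1/(m − 1)) · σ_M²/(σ² − σ_M²). That form is an inference, adopted here only because it reproduces the printed figures; the variable names are illustrative.

```python
# Values printed in the text.
SIGMA_M_SQ = 3.938354   # variance of the whorl means, whorls 2 to 10
SIGMA_SQ = 6.721083     # variance of all branch counts, Table VIII
M = 9                   # homologous parts (whorls) taken per plant
RHO = 0.131258          # uncorrected homotyposis, (xii)

# Variance remaining once the differentiation due to position is removed.
within = SIGMA_SQ - SIGMA_M_SQ

# The two quantities of (xi).
scale = SIGMA_SQ / within
offset = SIGMA_M_SQ / within / (M - 1)

# ASSUMPTION: formula (x) is taken to be R = rho * scale + offset.
R = RHO * scale + offset

print(f"sigma^2 / (sigma^2 - sigma_M^2)           = {scale:.6f}")
print(f"sigma_M^2 / ((m - 1)(sigma^2 - sigma_M^2)) = {offset:.6f}")
print(f"corrected homotyposis R                    = {R:.4f}")
```

Under this assumed form the correction both rescales the crude correlation by the ratio of total to within-position variance and adds a term that shrinks as more homologous parts are taken per individual.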


This result, it must be admitted, is extremely satisfactory, and indicates 
how it is quite possible to correct a result like (xii) by allowing 
for the differentiation of the homologous parts due to serial position.* 
I hope before long to publish other results dealing with homotyposis 
in serial parts, where the differentiation has every variety of intensity. 
I think they will suffice to show that differentiation is not a subtle 
and evasive quality beyond the appreciation of the naturalist who is 
provided with the training requisite for modern biometric research. 

(9.) The values of R as given by (ix) and (x) may be illustrated 
from the actual numbers for Equisetum arvense. We have seen in the 
footnote, p. 304, that 

$$\eta = \sigma_M / \sigma.$$

This in our case gives 

$$\eta = 0.76549.$$

But by direct calculation on Table VIII, using whorls 2 to 10, 
Dr. Lee finds $r_{pc} = -0.64616$. Hence with the notation of the 
footnote referred to 

$$\sigma' = 0.7632\,\sigma, \qquad \Sigma_M = 0.4104\,\sigma.$$


* The value obtained for the crude homotyposis of the members of the whorls 
in Asperula odorata in my first memoir was ρ = 0.1733 (‘Phil. Trans.,’ A, vol. 197, 
p. 326). I have little doubt that when we are able next summer to calculate the 
correction for differentiation in position of whorl, we shall find R for woodruff in 
good accordance with other homotypic results. My remarks about it were: “In 
counting the members on the whorls I soon found evidences of differentiation in 
position, the whorls towards the top of the spray having, as a rule, fewer members 
than those lower down” (loc. cit., p. 325). Unfortunately I have not kept my 
records of position. 






Table IX.—Uncorrected Homotyposis for Equisetum arvense. 

(Correlation table of the number of branches in pairs of whorls; total of entries, 7920.) 


Thus $\eta$ diverges much from $r_{pc}$ and $\Sigma_M$ from zero. Indeed, a glance at 
the diagram shows how far we are from true linear regression. If 
we use $r_{pc}$ as above instead of $\eta$, i.e., (ix) instead of (x), we have 

$$R = 0.3511,$$

a value very much below the actual value. This illustration will 
suffice to emphasise the importance of testing the actual curve of 
regression before we assume it to be linear and use equation (ix). 
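
The departure from linearity can be checked numerically from the printed constants. The footnote on p. 304 is not reproduced here, so the sketch assumes that the first multiple of σ quoted in (9.) is √(1 − r_pc²) and the second is √(η² − r_pc²). These readings are inferred only because they return the printed 0.7632 and 0.4104; the names are illustrative.

```python
import math

# Values printed in the text.
SIGMA_M_SQ = 3.938354   # variance of the whorl means, whorls 2 to 10
SIGMA_SQ = 6.721083     # variance of all branch counts, Table VIII
R_PC = -0.64616         # correlation of branch number with whorl position

# Correlation ratio: eta = sigma_M / sigma.
eta = math.sqrt(SIGMA_M_SQ / SIGMA_SQ)

# ASSUMPTION: variability about a straight regression line, in units of sigma.
about_line = math.sqrt(1 - R_PC ** 2)

# ASSUMPTION: variability of the array means about that line, in units of sigma.
# It would vanish if the regression were truly linear (eta equal to |r_pc|).
means_about_line = math.sqrt(eta ** 2 - R_PC ** 2)

print(f"eta                  = {eta:.5f}")
print(f"sqrt(1 - r^2)        = {about_line:.4f}")
print(f"sqrt(eta^2 - r^2)    = {means_about_line:.4f}")
```

The third quantity is the useful diagnostic in this sketch: the further it lies from zero, the less safe it is to put the linear correlation in place of the correlation ratio.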

(10.) The subject of differentiation due either to position or age is, 
of course, a difficult one, but it does not seem at all beyond biometric 
treatment. The greatest difficulty which it seems to me will have to 
be encountered is not that of discovering and allowing for differentiation 
due to serial position, but in ensuring that when this has been 
allowed for, there is not remaining an organic correlation due to the 
necessity of adjacent parts “fitting.” On this account it is most 
desirable that as large a number of homotypes as possible shall be 
taken, so that the part of the correlation due to the homologous parts 
having to fit, or, indeed, to serve a common end, should be reduced to 
as small a quantity as possible. For example, if we suppose adjacent 
whorls to have their number of branches influenced by an organic 
relationship, this result will only bias nine out of the forty-five pairs we 
should form in dealing with ten whorls. The question of separating 
organic from homotypic correlation is one that I hope to return to at 
a later date. Meanwhile the present paper will suffice to indicate how 
partial correlation coefficients enable the biometrician to free himself 
from the differentiation between individuals due to different periods 
of growth, or to different positions on the organism. 
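
The count of nine adjacent pairs among forty-five, given above for ten whorls, can be verified by enumeration. The function name `adjacent_share` is illustrative only.

```python
from itertools import combinations


def adjacent_share(m):
    """Return (adjacent pairs, all pairs) among m serially ordered parts."""
    pairs = list(combinations(range(1, m + 1), 2))
    adjacent = [(a, b) for a, b in pairs if b - a == 1]
    return len(adjacent), len(pairs)


for m in (3, 5, 10, 20):
    adj, total = adjacent_share(m)
    print(f"m = {m:2d}: {adj} adjacent pairs out of {total} ({adj / total:.3f})")
```

Since there are m − 1 adjacent pairs among m(m − 1)/2 in all, the share is 2/m, which is why taking more homologous parts dilutes any correlation arising from neighbouring parts having to fit.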

In conclusion I should like to thank Dr. Alice Lee and Mr. F. E. 
Lutz for aid readily granted at one or other stage of this investigation. 



